1. Introduction

The quantum probability law $\mathrm{tr}(E\rho)$ (its so-called trace-rule form) is one of the
fundamental pillars of modern physics, along with Einstein's famous energy formula
$E = mc^2$ and Boltzmann's immortal entropy expression $S = k \log W$. Gleason gave a
seminal derivation of the quantum probability law in his theorem [1]. Nevertheless, its
transparency leaves much to be desired. Though the quantum probability law
looks simple, there are "wheels within wheels" in it. Therefore, it is important to view
it from as many different angles as possible in order to comprehend the intricacies
involved in it.

A number of alternative derivations have appeared in the literature. Let me mention
just a few.

(i) The approaches based on the so-called eigenvalue-eigenstate link [2], [3], [4];

(ii) The decision-theoretic approaches [5], [6], [7], [8]; 

(iii) Derivation from operational assumptions [9]; 

(iv) The approach via entanglement. 

The last-mentioned approach went under the title "Born's rule from envariance"
(environment-assisted invariance). There were four articles by Zurek [10], [11], [12], [13],
who invented the approach, and four more articles by commentators [14], [15],
[16], [17], and finally my own contribution in terms of a complete theory of twin unitaries
(the other face of envariance) [18]. The first eight articles had two restrictions in establishing
essentially the trace rule $\mathrm{tr}(E\rho)$ for probability, where $E$ was an event (projector)
and $\rho$ was the subsystem density operator: they handled only improper mixtures [19],
and they did not go beyond the commutation restriction $[E, \rho] = 0$.

My article emphasized the role of $\sigma$-additivity in the derivations from entanglement
(it is the sole assumption in Gleason's theorem). I suggested surmounting the commutation
restriction by resorting to minimal quantum non-demolition (QND) measurement.

Subsequently I have realized that minimal measurement is by itself sufficient to
derive the entire trace rule. It has the advantage that it does not require the
$\sigma$-additivity assumption, and thus it is complementary to Gleason's theorem [1]. This
article is devoted to the exposition of the minimal-measurement approach.

The paper is based on the idea that probabilities are predictions for the statistical 
weights of definite-result sub-ensembles in measurement. These are, in the end, detected 
as relative frequencies. 

2. Assumptions of the Derivation 

We are dealing with an arbitrary observable $A$ that has a purely discrete spectrum
$\{a_n : \forall n\}$. We write it in spectral form

$$A = \sum_n a_n P_n, \qquad n \neq n' \;\Rightarrow\; a_n \neq a_{n'}. \qquad (1)$$

2.1. The assumptions

The assumptions of the approach read as follows. 

(i) States are described by density operators $\rho$.

By "state" we mean an ensemble of quantum systems prepared by a certain
procedure. Any measurement converts the initial state $\rho$ into a final state $\rho'$ (in
the so-called non-selective version, when the entire ensemble is considered). The latter
is decomposable into states $\rho'_n$ that correspond to the different results $a_n$ of $A$:

$$\rho' = \sum_n w_n \rho'_n, \qquad \forall n: \; w_n \geq 0, \qquad \sum_n w_n = 1. \qquad (2)$$

If the measurement is not a QND one, then the states $\{\rho'_n : \forall n, w_n > 0\}$ need
not be in any simple relation to $A$. They correspond to definite pointer positions
on the measuring instrument (which we make no use of in this approach). The
statistical weights $w_n$ apply both to the states $\rho'_n$ of the selective version (in which
definite results are considered) and to the corresponding pointer positions. By the very
definition of measurement, the weights equal the probabilities:

$$\forall n: \quad w_n = p(a_n, A, \rho) \qquad (3)$$

(in obvious notation). In other words, as was stated in the Introduction, the probabilities
$p(a_n, A, \rho)$ are understood to be the predictions for the statistical weights $w_n$,
which become relative frequencies when the measurement is performed on the individual
systems that make up the ensemble.

QND measurement, by definition, converts an initial state $\rho$ into a final state $\rho'$,
which has two properties:

(a) The states $\rho'_n$ that determine the terms in decomposition (2) are dispersion-free
with respect to the observable $A$:

$$\forall n, \; w_n > 0: \quad p(a_n, A, \rho'_n) = 1. \qquad (4)$$

(b) If the initial state $\rho$ is itself dispersion-free with respect to $A$, i.e.,
$\exists n: \; p(a_n, A, \rho) = 1$, then so is the final state, and the sharp value of $A$ is the same:
$\rho' = \rho'_n$, but, in general, the initial and the final states need not be equal. (Earlier used
synonyms for "non-demolition" were "repeatable", "predictive", "first-kind", etc.)

(ii) Further, we assume that if and only if a state $\rho$ satisfies

$$\mathrm{tr}(P_n \rho) = 1, \qquad (5a)$$

then the probability $p(a_n, A, \rho)$ of the value $a_n$ of the observable $A$ in this state is
1. In other words, we assume the validity of the trace rule for probability-one events.
It is proved in Appendix A that (5a) is (mathematically) equivalent to

$$P_n \rho P_n = \rho. \qquad (5b)$$

Let us denote by $\rho''_n$ any state that has the sharp value $a_n$ of $A$:
$p(a_n, A, \rho''_n) = 1$, and let us consider the family of all mixtures

$$\rho'' = \sum_n v_n \rho''_n, \qquad \forall n: \; v_n \geq 0; \qquad \sum_n v_n = 1. \qquad (6)$$

An immediate consequence of (5b) is that decomposition (6) can be rewritten as

$$\rho'' = \sum_n v_n P_n \rho''_n P_n,$$

which, on account of the orthogonality and idempotency of the eigen-projectors,
$P_n P_{n'} = \delta_{n,n'} P_n$, implies

$$\rho'' = \sum_n P_n \rho'' P_n. \qquad (7)$$

Since (7) is obviously sufficient for (6), relation (7) also characterizes states that are mixtures of
states with definite values of $A$.
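
As a minimal worked illustration of (7), assume a single qubit and an observable with the two ray projectors $P_0 = |0\rangle\langle 0|$ and $P_1 = |1\rangle\langle 1|$. This two-level example and the matrix entries $\alpha$, $\beta$, $\gamma$ are assumptions introduced here for illustration only. For a general density matrix written in this basis one finds:

```latex
\rho'' = \begin{pmatrix} \alpha & \gamma \\ \gamma^{*} & \beta \end{pmatrix},
\qquad
\sum_{n=0,1} P_n \rho'' P_n = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix},
\qquad \alpha, \beta \geq 0, \quad \alpha + \beta = 1 .
```

Hence (7) holds exactly when $\gamma = 0$, and then $\rho'' = \alpha\,|0\rangle\langle 0| + \beta\,|1\rangle\langle 1|$ is a mixture of the form (6) with $v_0 = \alpha$ and $v_1 = \beta$.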

If an initial state $\rho$ and an observable (1) are given, then a subset of the family
of states (7) are final states of QND measurements.

Our next-to-last assumption is: 

(iii) The state $\rho''$ in the family of states (7) that is closest to the initial state
$\rho$ is the final state of a QND measurement of the observable $A$. Here "closest" is
meant in the sense of minimal distance, where distance is taken in the Hilbert space
$\mathcal{H}_{HS}$ of all Hilbert-Schmidt (HS) operators (cf [20] and Appendix B below). All density
operators are HS operators.

Distance is also mathematically defined in a proper subset of $\mathcal{H}_{HS}$, namely in the set
of all trace-class operators, for which by definition $\mathrm{tr}\,\rho < \infty$. We take distance
in $\mathcal{H}_{HS}$ on account of Lemma-C in Appendix C.

Our last assumption is 

(iv) The probabilities $p(a_n, A, \rho)$ are the same in all measurements of $A$ in $\rho$.

2.2. Discussion of the assumptions

Assumptions (i) and (iv) have a basic (almost axiomatic) position in the conceptual 
structure of quantum mechanics. 

Assumption (ii) stipulates the trace law for events that are certain. Here we are on
similar grounds as Zurek was [10]-[13], when he set out to derive Born's rule assuming
its validity for events that are certain. (In [18], though, when the full power of envariance
was made use of, the trace law under the restriction $[E, \rho] = 0$ was derived with no
probability-law assumption to start with.)

Assumption (iii) can be viewed as the definition of minimal (or minimal-disturbance)
QND measurement. Namely, "closest" can be understood as "minimally
changed".

In the next section we derive $\rho''$, and thus we obtain the probabilities.

3. Derivation of the trace rule

We now adapt a former derivation [21] of the Lüders formula [22] to the present purpose.
The argument is very simple. It is based on three almost evident remarks:

Remark 1. The super-operator $P_A \equiv \sum_n P_n \ldots P_n$ (cf (1)) is a projector in
$\mathcal{H}_{HS}$. (The dots show the place where any HS operator $B \in \mathcal{H}_{HS}$ should be in the
sum of products when $P_A$ is applied to it.) One easily shows the claimed Hermiticity
and idempotency of $P_A$ in $\mathcal{H}_{HS}$ (cf Appendix B).

Let us denote by $S_A$ the subspace of $\mathcal{H}_{HS}$ onto which $P_A$ projects.

Remark 2. As is obvious from (7), each density operator $\rho''$ from the family
(6) (or (7)) is an element of $S_A$. Conversely, the family (6) consists of all density
operators that are in $S_A$.

Remark 3. If $\rho$ is a density operator, then so is its projection $P_A(\rho)$ (as is easily
seen).

If $\rho$ is an arbitrary initial state, its closest element in $S_A$ is its projection into
$S_A$ (cf Appendix D). The projection is a density operator on account of Remark 3.
The projection belongs to the family (6) owing to Remark 2. Relation (3) implies that
the weights in the projection give the probabilities.

Finally, let us write down the projection:

$$P_A(\rho) = \sum_n P_n \rho P_n.$$

This is the well-known formula of Lüders, which gives the change of state in minimal
QND measurement (also called ideal measurement) [22].

Making the weights in the preceding relation explicit, one obtains

$$P_A(\rho) = \sum_n \mathrm{tr}(P_n \rho) \, \frac{P_n \rho P_n}{\mathrm{tr}(P_n \rho)}, \qquad (8)$$

where the sum runs over those $n$ for which $\mathrm{tr}(P_n \rho) > 0$.

Relations (3) and (8) give our final result:

$$\forall \rho, \; \forall n: \quad p(a_n, A, \rho) = \mathrm{tr}(P_n \rho). \qquad (9)$$

In this way the trace-rule form of the quantum probability law is derived.

Incidentally, if the event is elementary (mathematically, a ray projector)
$P_n = |\phi\rangle\langle\phi|$, then the quantum probability law is known in the form $\langle\phi|\rho|\phi\rangle$.
If the state is also pure (mathematically also a ray projector), $\rho = |\psi\rangle\langle\psi|$, then one has
the transition-probability form $|\langle\phi|\psi\rangle|^2$. (All this obviously follows from the trace rule.)
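
The derivation of Section 3 can be checked numerically on a small example. The following is a minimal sketch, not part of the original argument: it assumes a three-level system, a hand-picked density matrix `rho`, and an observable with one rank-2 and one rank-1 eigen-projector. All function and variable names (`matmul`, `luders`, `hs_dist2`, `sigma`, and so on) are hypothetical helpers introduced here. It forms the projection $\sum_n P_n \rho P_n$, reads off the weights $\mathrm{tr}(P_n \rho)$, and compares the Hilbert-Schmidt distance from `rho` to the projection with the distance to one other state satisfying (7).

```python
# Sketch: Lueders projection and trace-rule weights for a hand-picked qutrit state.
# Matrices are nested lists of Python numbers; all names here are illustrative.

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def hs_dist2(X, Y):
    """Squared Hilbert-Schmidt distance: sum of |X_ij - Y_ij|^2."""
    n = len(X)
    return sum(abs(X[i][j] - Y[i][j]) ** 2 for i in range(n) for j in range(n))

def luders(rho, projectors):
    """Return sum_n P_n rho P_n."""
    n = len(rho)
    out = [[0] * n for _ in range(n)]
    for Pn in projectors:
        term = matmul(matmul(Pn, rho), Pn)
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

# Assumed observable: two eigenvalues, eigen-projectors of rank 2 and rank 1.
P = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 1]],
]

# Assumed initial state: Hermitian, unit trace, positive definite.
rho = [
    [0.5, 0.1j, 0.2],
    [-0.1j, 0.3, 0.1],
    [0.2, 0.1, 0.2],
]

rho_L = luders(rho, P)                           # projection of rho
weights = [trace(matmul(Pn, rho)) for Pn in P]   # tr(P_n rho)

# Another state satisfying (7), used only for a distance comparison.
sigma = [[0.4, 0, 0], [0, 0.4, 0], [0, 0, 0.2]]

print("weights:", [round(abs(w), 12) for w in weights])
print("dist^2 to projection:", hs_dist2(rho, rho_L))
print("dist^2 to sigma:     ", hs_dist2(rho, sigma))
```

For this particular input the weights are 0.8 and 0.2 up to floating-point rounding, and the projection is closer to `rho` than `sigma` is. A single comparison state does not prove minimality, of course; that is the content of Lemma-D.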

Appendix A 

We prove now the following auxiliary result that sheds light on assumption (ii). 

Lemma-A If $\rho$ and $P$ are a density operator and a projector respectively, then
$\mathrm{tr}(\rho P) = 1$ is equivalent to $P \rho P = \rho$.

Proof. It is obvious (by taking the trace) that the latter relation implies the former.
The converse implication is not quite trivial.

Since every density operator is a trace-class operator, it has a finite or countably
infinite discrete positive spectrum $\{r_i : \forall i\}$ (with possible repetitions in the
eigenvalues). Hence, it can be written in spectral form as

$$\rho = \sum_i r_i |i\rangle\langle i|, \qquad (A.1)$$

where $|i\rangle$ is an eigenvector corresponding to the eigenvalue $r_i$.

The relation $\mathrm{tr}(\rho P) = 1$ implies $\mathrm{tr}(\rho P^{\perp}) = 0$ ($P^{\perp} \equiv 1 - P$). Substituting
(A.1) in the latter relation, one obtains $\sum_i r_i \langle i | P^{\perp} | i \rangle = 0$. On account of the
positivity $\forall i: \; r_i > 0$, and the easily seen non-negativity $\forall i: \; \langle i | P^{\perp} | i \rangle \geq 0$, one
further has $\forall i: \; 0 = \langle i | P^{\perp} | i \rangle = \| P^{\perp} | i \rangle \|^2$, as well as $\forall i: \; P^{\perp} | i \rangle = 0$, and
$\forall i: \; P | i \rangle = | i \rangle$. Then, applying $P \ldots P$ to (A.1), one obtains the second relation in
Lemma-A. □
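
The mechanism of the proof can be seen in the smallest nontrivial case. The following sketch assumes a qubit and the ray projector $P = |0\rangle\langle 0|$; the matrix entries $\alpha$, $\beta$, $\gamma$ are illustrative symbols introduced here, not notation from the text.

```latex
\rho = \begin{pmatrix} \alpha & \gamma \\ \gamma^{*} & \beta \end{pmatrix},
\quad \alpha + \beta = 1, \quad |\gamma|^2 \leq \alpha\beta ;
\qquad
\mathrm{tr}(\rho P) = \alpha = 1
\;\Rightarrow\; \beta = 0
\;\Rightarrow\; |\gamma|^2 \leq 0
\;\Rightarrow\; \gamma = 0
\;\Rightarrow\; P \rho P = \rho .
```

Here the positivity condition $|\gamma|^2 \leq \alpha\beta$ plays the role that the non-negativity of the diagonal matrix elements of $1 - P$ plays in the general proof.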



Appendix B 

By definition, linear operators $A$ in a complex separable Hilbert space are Hilbert-Schmidt
ones if $\mathrm{tr}(A^{\dagger} A) < \infty$ ($A^{\dagger}$ being the adjoint of $A$). The scalar product in
the Hilbert space $\mathcal{H}_{HS}$ of all linear Hilbert-Schmidt operators is $(A, B) \equiv \mathrm{tr}(A^{\dagger} B)$
(cf the Definition after Theorem VI.21 and Problem VI.48(a) in [20]).
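
In finite dimension $d$ these definitions reduce to familiar matrix expressions. The following is a sketch under that finite-dimensional assumption, with $A_{ij}$ and $B_{ij}$ denoting matrix elements in an arbitrary orthonormal basis (notation introduced here for illustration):

```latex
(A, B) = \mathrm{tr}(A^{\dagger} B) = \sum_{i,j=1}^{d} A_{ij}^{*} B_{ij},
\qquad
\|A - B\|^2 = \mathrm{tr}\big[(A - B)^{\dagger} (A - B)\big] = \sum_{i,j=1}^{d} |A_{ij} - B_{ij}|^2 .
```

The HS distance is thus the entrywise Euclidean (Frobenius) distance between the two matrices.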



Appendix C 

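Let $\mathcal{H}$ be a separable, complex Hilbert space, and $\mathcal{H}_{HS}$ the Hilbert space of all
linear Hilbert-Schmidt operators in it (cf Appendix B). Let, further, $|\psi\rangle$ and $|\phi\rangle$
be two arbitrary unit vectors in $\mathcal{H}$. The square of the distance between them in $\mathcal{H}$
is

$$[d_{\mathcal{H}}(|\psi\rangle, |\phi\rangle)]^2 = \| |\psi\rangle - |\phi\rangle \|^2 = (\langle\psi| - \langle\phi|)(|\psi\rangle - |\phi\rangle) = 2 - 2\,\mathrm{Re}\,\langle\phi|\psi\rangle. \qquad (C.1)$$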



It depends on the relative phase between the two vectors.

Definition-C (i) We make the convention that, whenever the distance between
two unit vectors in $\mathcal{H}$ is in question, it is understood that the relative phase is chosen
so that the distance in (C.1) is minimal, i.e., that

$$\langle\phi|\psi\rangle \geq 0. \qquad (C.2)$$

(ii) We use the word "closer" in the sense of "not farther", i.e., as $\leq$, and not
as $<$.



Lemma-C Let $|\psi\rangle$, $|\phi\rangle$, and $|\chi\rangle$ be three arbitrary unit vectors in $\mathcal{H}$.
Then, taking the phase factors of $|\phi\rangle$ and $|\chi\rangle$ in accordance with Definition-C (i),
the former is closer than the latter to the state vector $|\psi\rangle$ in $\mathcal{H}$ if and only if the
corresponding pure state $|\phi\rangle\langle\phi|$ is closer than $|\chi\rangle\langle\chi|$ to $|\psi\rangle\langle\psi|$ in $\mathcal{H}_{HS}$. In
other words, closer in $\mathcal{H}$ (observing Definition-C (i)) is the case if and only if it is
true for the corresponding ray projectors in $\mathcal{H}_{HS}$.

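Proof. In view of (C.1) and Definition-C (i), $|\phi\rangle$ is closer to $|\psi\rangle$ than $|\chi\rangle$ is
to $|\psi\rangle$ if and only if

$$\big(2 - 2|\langle\phi|\psi\rangle|\big) \leq \big(2 - 2|\langle\chi|\psi\rangle|\big) \;\Leftrightarrow\; |\langle\phi|\psi\rangle| \geq |\langle\chi|\psi\rangle|. \qquad (C.3)$$

On the other hand, one has

$$[d_{HS}(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|)]^2 = \mathrm{tr}\big[(|\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|)^2\big] = 2 - 2|\langle\phi|\psi\rangle|^2. \qquad (C.4)$$

Hence, the pure state $|\phi\rangle\langle\phi|$ is "closer" to $|\psi\rangle\langle\psi|$ than $|\chi\rangle\langle\chi|$ is in
$\mathcal{H}_{HS}$ if and only if

$$|\langle\phi|\psi\rangle|^2 \geq |\langle\chi|\psi\rangle|^2.$$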

Finally, since an inequality between two non-negative numbers holds true if and 
only if the same inequality is valid between their squares, one can see from (C.3) and 
(C.4) that Lemma-C is proved. □ 
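
Relations (C.1) and (C.4) and the resulting equivalence can be spot-checked numerically. The sketch below is illustrative only: it assumes a two-dimensional Hilbert space and three hand-picked unit vectors, and the helper names (`inner`, `align_phase`, `dist2_vec`, `dist2_hs`) are hypothetical, introduced for this example.

```python
# Sketch: compare closeness of unit vectors with closeness of their ray projectors.

def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def align_phase(phi, psi):
    """Rephase phi so that <phi|psi> is real and non-negative (Definition-C (i))."""
    ov = inner(phi, psi)
    if abs(ov) == 0:
        return list(phi)
    phase = ov / abs(ov)
    return [phase * a for a in phi]

def dist2_vec(u, v):
    """Squared distance between two vectors, as in (C.1)."""
    return sum(abs(a - b) ** 2 for a, b in zip(u, v))

def dist2_hs(u, v):
    """Squared HS distance between |u><u| and |v><v|, as in (C.4)."""
    n = len(u)
    return sum(abs(u[i] * complex(u[j]).conjugate()
                   - v[i] * complex(v[j]).conjugate()) ** 2
               for i in range(n) for j in range(n))

psi = [1, 0]
phi = align_phase([0.6j, 0.8], psi)     # |<phi|psi>| = 0.6
chi = align_phase([0.28, 0.96j], psi)   # |<chi|psi>| = 0.28

closer_in_H = dist2_vec(psi, phi) <= dist2_vec(psi, chi)
closer_in_HS = dist2_hs(psi, phi) <= dist2_hs(psi, chi)
print(closer_in_H, closer_in_HS)
```

For these vectors both comparisons agree, as Lemma-C asserts. Without the call to `align_phase` the distance in (C.1) would depend on an arbitrary phase, which is why Definition-C (i) is needed.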

Appendix D 

Now we prove (for completeness) a very elementary auxiliary lemma. 

Lemma-D Let $\mathcal{H}$ and $S$ be a separable (finite or infinite dimensional) complex
Hilbert space and a subspace in it respectively. Let, further, $P$ be the projector onto
$S$. For every element $a \in \mathcal{H}$, there is a unique element $b \in S$ that is closest to $a$
among all elements of $S$. It is $b = Pa$. Here "closest" is meant in the sense of
minimal distance $\|a - b\|$.

Proof. For every $a \in \mathcal{H}$ and every $b \in S$, one can utilize the orthogonality
between the vectors from the orthocomplement of $S$ and those from $S$ itself:

$$\|a - b\|^2 = \|(a - Pa) + (Pa - b)\|^2 = \|a - Pa\|^2 + \|Pa - b\|^2.$$

This is minimal with respect to the choice of $b \in S$ if and only if $b = Pa$, because
whenever $b \in S$ and $b \neq Pa$, one has $\|Pa - b\|^2 > 0$. □
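
The Pythagorean decomposition used in the proof can be verified on a concrete case. The following sketch assumes the three-dimensional complex space and a hand-picked two-dimensional subspace; the names `inner`, `sub`, `norm2`, `project`, and the sample vectors are hypothetical and serve only as an illustration.

```python
# Sketch: check the decomposition ||a-b||^2 = ||a-Pa||^2 + ||Pa-b||^2 in C^3.

def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

def norm2(u):
    """Squared norm of a vector."""
    return inner(u, u).real

def project(a, basis):
    """Project a onto the span of an orthonormal basis (list of vectors)."""
    out = [0] * len(a)
    for e in basis:
        c = inner(e, a)
        out = [o + c * x for o, x in zip(out, e)]
    return out

basis = [[1, 0, 0], [0, 0.6, 0.8]]   # assumed orthonormal basis of S
a = [2, 1, 1j]                       # arbitrary element of the full space
Pa = project(a, basis)
b = [1, 1.2, 1.6]                    # another element of S: e_1 + 2 e_2

lhs = norm2(sub(a, b))
rhs = norm2(sub(a, Pa)) + norm2(sub(Pa, b))
print(lhs, rhs)
```

The two printed numbers agree up to rounding, and the distance from `a` to `Pa` is strictly smaller than the distance from `a` to `b`, in line with the lemma.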